Building of Networks of Natural Hierarchies of Terms Based on Analysis of Texts Corpora

Author

  • D. V. Lande
Abstract

A method is proposed for building a network of natural hierarchies of terms, which may be regarded as a "quasi-ontology", i.e. a basis for forming the corresponding terminological ontology. The network is built from "significantly informative" text elements: reference words and phrases. The methodology for identifying such terms is given in [1, 2]. Such elements can be used to form search images and to cover entire knowledge bases for the subsequent construction of a common ontology. Reference words and phrases for the natural hierarchy of terms are selected taking their discriminant power into account. However, this single property is not sufficient for constructing thesauri and ontologies. Sometimes words with low discriminant power, in particular the most frequent words of the given subject area (e.g., "Information", "Retrieval", "Search" in an information retrieval corpus), are essential for the task at hand.
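
The tension described above can be illustrated with a minimal sketch. It is not the method of [1, 2]: it assumes inverse document frequency (IDF) as a stand-in for discriminant power, and the toy corpus and the function names idf_scores and term_frequencies are hypothetical. It shows that a word present in every document of a subject area scores zero on such a measure while still being the most frequent term.

```python
import math
from collections import Counter


def idf_scores(documents):
    """Return {term: log(N / df)}; IDF is used here only as an assumed
    proxy for discriminant power, not as the measure from the paper."""
    n_docs = len(documents)
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document.
        doc_freq.update(set(text.lower().split()))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def term_frequencies(documents):
    """Return total occurrence counts of each term over the corpus."""
    counts = Counter()
    for text in documents:
        counts.update(text.lower().split())
    return counts


# Hypothetical toy corpus for the information retrieval subject area.
corpus = [
    "information retrieval search engine ranking",
    "information retrieval query expansion",
    "information search index compression",
]

idf = idf_scores(corpus)
tf = term_frequencies(corpus)

# Print terms from most to least frequent, with their proxy score.
for term, count in tf.most_common():
    print(f"{term:12s} count={count} idf={idf[term]:.3f}")
```

Under these assumptions, a selection rule based only on the proxy score would discard "information", which is why a second criterion such as corpus frequency may be needed alongside it.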


Similar resources

Building domain specific lexical hierarchies from corpora

In this article, we present a new algorithm for building domain specific lexical hierarchies from texts. The basic elements of such a hierarchy are the normalized terms – mono and multi-word terms – extracted from a large corpus by a terminological extractor. The algorithm relies on collocations for representing the meaning of these terms, finding hierarchical relations between them and finally...

Comparative Study of the Academic Vocabulary Content of Electronic Engineering Corpora, GE Materials and M.S. Entrance Examinations

The importance of vocabulary learning has been underlined in the field of English for Academic Purposes (EAP) because non-English majors who require reading English texts in their fields of study have to expand their English vocabulary knowledge much more efficiently than ordinary ESL/EFL learners. Since academic vocabulary instruction in Iranian universities is realized through the use of Gene...

Syntactic Complexity of Russian Unified State Exam Texts in English: A Study on Reliability and Validity

In this study we analyze texts used in Russian Unified State Exam on English language. Texts that formed small research corpora were retrieved from 2 resources: official USE database as a reference point, and popular website used by pupils for USE training “Neznaika” (https://neznaika.pro/). The size of two corpora is balanced: USE has 11934 tokens and “Neznaika” - 11918 tokens. We share Biber’...

Ontologies, Taxonomies, Thesauri: Learning from Texts

The use of ontologies as representations of knowledge is widespread but their construction, until recently, has been entirely manual. We argue in this paper for the use of text corpora and automated natural language processing methods for the construction of ontologies. We delineate the challenges and present criteria for the selection of appropriate methods. We distinguish three major steps in...

The Genre of Landscape and Building-Painting (The Artistic) and the Discourse of Nationalism (The Political), during the Late Qajar and Early Pahlavi Periods an Analysis Based on “Mediation” and “Totality” in Methodology of Georg Lukács

Landscape and building-painting as a genre is one of the many features of Iranian painting in the last years of the Qajar and the first years of the Pahlavi era. This paper explains the relation between these painting as the particular, following the domination of the discourse of nationalism as the general. To reason this idea and to explain the relationship between these two, Georg Lukács the...

Journal:
  • CoRR

Volume abs/1405.6068

Publication date: 2014